A novel broad spectrum venom metalloproteinase autoinhibitor in the rattlesnake Crotalus atrox evolved via a shift in paralog function

Significance
This study investigates how rattlesnakes protect themselves from their own evolving venom. Venom toxins can be dangerous if they enter the snake’s tissues or circulatory system during feeding or digestion, and many venomous species have evolved autoresistance to their own toxins. Here, we identify a broad-spectrum serum inhibitor of rattlesnake metalloproteinases, a group of toxins that causes tissue destruction and hemorrhage. We show that this inhibitor is a different member of a protein family known to inhibit metalloproteinases in Asian and South American relatives of rattlesnakes. We suggest that changes in the number and diversity of rattlesnake metalloproteinases selected for the emergence of this broad-spectrum inhibitor, which may be useful in snakebite treatment.


The variants were normalized (bcftools norm), filtered to remove adjacent variants within 5 base pairs of each other (bcftools filter --IndelGap 5), and indexed (bcftools index). The indexed variant file was then used to call a reference-based C. atrox fetua consensus genomic sequence (bcftools consensus) (8). Read coverage and gaps in coverage near the fetua gene loci were visualized using Gviz (9).
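The order of the bcftools steps described above can be sketched as a small Python helper that assembles the command lines. This is an illustrative sketch, not the authors' script: the file names, the `prefix` argument, the compressed-VCF output options, and the use of `-f` with `bcftools norm` are all assumptions, since the text gives only the subcommands and the `--IndelGap 5` setting.

```python
import subprocess


def build_consensus_commands(ref_fasta, raw_vcf, prefix):
    """Return the bcftools command lines, in order, as argument lists.

    All file names are hypothetical placeholders chosen for illustration.
    """
    norm_vcf = f"{prefix}.norm.vcf.gz"
    filt_vcf = f"{prefix}.filt.vcf.gz"
    consensus_fa = f"{prefix}.consensus.fa"
    return [
        # Normalize variants against the reference (use of -f is an assumption).
        ["bcftools", "norm", "-f", ref_fasta, "-Oz", "-o", norm_vcf, raw_vcf],
        # Filter clusters of indels separated by 5 bp or fewer.
        ["bcftools", "filter", "--IndelGap", "5", "-Oz", "-o", filt_vcf, norm_vcf],
        # Index the filtered file so that consensus can read it.
        ["bcftools", "index", filt_vcf],
        # Apply the variants to the reference to produce a consensus sequence.
        ["bcftools", "consensus", "-f", ref_fasta, "-o", consensus_fa, filt_vcf],
    ]


def run_pipeline(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands only; nothing is executed here.
    for cmd in build_consensus_commands("reference.fa", "calls.vcf.gz", "fetua"):
        print(" ".join(cmd))
```

Building the commands separately from running them makes it easy to inspect or log the exact invocations before anything touches the data.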

MDC-4 purification
The purification of MDC-4 from C. atrox venom followed the protocol of Williams et al. (10) with the following modifications: 400 mg of C. atrox venom was resuspended in 5 ml of 20 mM Tris-HCl buffer (pH 7.6), filtered through a 0.2 µm Durapore syringe filter, and applied to a 5 ml HiTrap Q XL Sepharose anion exchange column. The column was washed with 20 mM Tris-HCl buffer (pH 7.6) until the UV signal returned to baseline. Protein elution was performed at a rate of 5 ml/min using a gradient of 20 mM Tris-HCl (pH 7.6), 1 M NaCl (up to 60%) on an ÄKTA FPLC system (GE Healthcare, UK) over 120 ml, collected as 2.5 ml fractions. The collected fractions were run on SDS-PAGE gels and analyzed by Coomassie staining, and fractions containing the protein of interest (a strong band migrating at ~50 kDa) were pooled. The pooled fractions were concentrated using a 10,000 MWCO Amicon Ultra-15 centrifugal filter and applied to a size exclusion chromatography column (HiLoad 16/600 Superdex 75 pg). Protein elution was performed at a rate of 1 ml/min using 20 mM Tris-HCl (pH 7.6), with 1 ml fractions collected. The fractions were analyzed by SDS-PAGE and Coomassie staining, and those containing a strong band at ~50 kDa (MDC-4 and other expressed MDCs) were pooled and concentrated before a second run on the same size exclusion column. Fractions containing the pure protein were pooled, concentrated, and stored as aliquots at -80 °C until further use (Fig. S3C). Protein estimation was performed using Quick Start Bradford 1x Dye Reagent (Bio-Rad) with bovine serum albumin (Thermo Scientific, catalog # 23208) as a standard.
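The salt gradient in the anion exchange step can be made concrete with a short calculation. The sketch below assumes the gradient is linear and runs from 0% to 60% of the 1 M NaCl buffer across the full 120 ml; the text states only the upper limit, so the starting point and linearity are assumptions, and the function names are hypothetical.

```python
def percent_b(volume_ml, gradient_ml=120.0, start_pct=0.0, end_pct=60.0):
    """Percent of high-salt buffer at a given elution volume (linear gradient assumed)."""
    if not 0.0 <= volume_ml <= gradient_ml:
        raise ValueError("volume must lie within the gradient")
    return start_pct + (end_pct - start_pct) * volume_ml / gradient_ml


def nacl_molar(volume_ml, stock_molar=1.0, **gradient):
    """NaCl concentration (M) delivered at a given elution volume."""
    return stock_molar * percent_b(volume_ml, **gradient) / 100.0


def fraction_count(gradient_ml=120.0, fraction_ml=2.5):
    """Number of whole fractions collected over the gradient."""
    return int(gradient_ml // fraction_ml)


def gradient_minutes(gradient_ml=120.0, flow_ml_per_min=5.0):
    """Duration of the gradient at the stated flow rate."""
    return gradient_ml / flow_ml_per_min
```

Under these assumptions the gradient ends at 0.6 M NaCl, yields 48 fractions of 2.5 ml, and takes 24 minutes at 5 ml/min.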

MAD-3 and MPO-1 purification
C. atrox MAD-3 and MPO-1 were purified in a single protocol that used different buffer pH values to separate MPO-1 (~24.7 kDa) from proteolytically processed MAD-3 (~19 kDa without the disintegrin domain) (Fig. S3B). C. atrox venom (400 mg) was resuspended in 5 ml of 20 mM Bis-Tris buffer (pH 6.5), filtered through a 0.2 µm Durapore syringe filter, and loaded onto a 5 ml HiTrap SP Sepharose cation exchange column. Protein elution was performed at a rate of 5 ml/min using a gradient of 20 mM Bis-Tris buffer (pH 6.5), 1 M NaCl (up to 60%) on an ÄKTA purifier system (GE Healthcare, UK) over 120 ml as Tris-HCl (pH 7.6), 300 mM NaCl before loading onto two size exclusion chromatography columns connected in tandem (HiLoad 16/600 Superdex 75 pg). Elution was performed at 0.5 ml/min and collected as 2.5 ml fractions, and those containing the protein of interest were pooled and concentrated (Fig. S3C).
The MPO-1-containing fraction (in 20 mM piperazine, pH 5.1) was concentrated, and the sample (5 ml) was applied to a 5 ml HiTrap SP Sepharose cation exchange column equilibrated in the same buffer. The column was washed with 20 mM piperazine (pH 5.1) until no protein was detected by the UV detector, and then eluted with a linear concentration gradient (0 to 60%) of 20 mM piperazine (pH 5.1), 1 M NaCl at a flow rate of 0.5 ml/min and a temperature of 4 °C. The eluted MPO-1 fractions were dialyzed against 1X PBS, concentrated to a 1 ml sample volume, and loaded onto two HiLoad 16/600 Superdex 75 pg columns connected in tandem. Protein elution was performed at a rate of 1 ml/min using PBS (pH 7.2), and 2.5 ml fractions were collected. The fractions were analyzed by SDS-PAGE and Coomassie staining, and those containing MPO-1 were pooled, concentrated (10,000 MWCO), and stored as aliquots at -80 °C until further use. Protein estimation was performed using Quick Start Bradford 1x Dye Reagent (Bio-Rad) with bovine serum albumin (Thermo Scientific, catalog # 23208) as a standard.

Enzymatic "In Gel" Digestion
Coomassie R-250 stained gel pieces were de-stained completely (50% methanol (MeOH), 50% water (H2O), 100 mM ammonium bicarbonate (NH4HCO3)), dehydrated (50% acetonitrile (ACN), 50% H2O, 25 mM NH4HCO3) for five minutes, and then incubated for 30 seconds in 100% ACN. Next, the samples were dried in a Speed-Vac for one minute, reduced in 25 mM DTT (dithiothreitol in 25 mM NH4HCO3) for 15 minutes at 56 °C, alkylated with 55 mM CAA (chloroacetamide in 25 mM NH4HCO3) in darkness at room temperature for 15 minutes, washed once in H2O, dehydrated (50% ACN, 50% H2O, 25 mM NH4HCO3) for two minutes, and then incubated for 30 seconds in 100% ACN. The samples were dried again and rehydrated with 20 μl of trypsin solution containing 0.01% ProteaseMAX™ surfactant (10 ng/μl trypsin from Promega Corp. in 25 mM NH4HCO3, 0.01% w/v ProteaseMAX™ from Promega Corp.) and left to stand for 2 minutes at room temperature, after which an additional 30 μl of overlay solution (25 mM NH4HCO3, 0.01% w/v ProteaseMAX™) was added to keep the gel pieces immersed throughout the digestion (three hours at 42 °C). Peptides generated by the digestion were transferred to a new tube and acidified with 2.5% TFA (trifluoroacetic acid) to a final concentration of 0.3%. The gel pieces were extracted further with 70% ACN, 29.25% H2O, 0.75% TFA for ten minutes while vortexing, and the solutions were combined and dried completely in a Speed-Vac (~15 minutes). Extracted peptides were solubilized in 30 µl of 0.05% TFA. Degraded ProteaseMAX™ was removed by centrifugation (maximum speed, ten minutes), and the peptides were solid-phase extracted (ZipTip® C18 pipette tips from MilliporeSigma) according to the manufacturer's protocol. Peptides were eluted off the C18 SPE column with five microliters of 70% ACN, 30% H2O, 0.1% TFA, dried to completion, and then resolubilized in a total volume of 30 µl of 0.1% formic acid, of which two microliters were loaded on the instrument.
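The acidification step above is a simple dilution problem: how much 2.5% TFA must be added to reach 0.3% in the combined volume. The sketch below works it through, assuming a digest volume of 50 µl (the 20 µl trypsin solution plus the 30 µl overlay); the recovered volume is not stated in the text, so that figure and the function name are assumptions.

```python
def acid_volume_ul(sample_ul, stock_pct, final_pct):
    """Volume of acid stock to add so the mixture reaches final_pct.

    Solves stock_pct * v = final_pct * (sample_ul + v) for v.
    """
    if final_pct >= stock_pct:
        raise ValueError("final concentration must be below the stock concentration")
    return final_pct * sample_ul / (stock_pct - final_pct)


if __name__ == "__main__":
    # Assumed 50 µl digest, 2.5% TFA stock, 0.3% target.
    volume = acid_volume_ul(50.0, 2.5, 0.3)
    print(f"Add about {volume:.1f} µl of 2.5% TFA")
```

With these assumed inputs the result is a little under 7 µl of 2.5% TFA.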

NanoLC-MS/MS
Peptides were analyzed by NanoLC-MS/MS using the Agilent 1100 nanoflow system (Agilent) connected to a hybrid linear ion trap-orbitrap mass spectrometer (LTQ-Orbitrap Elite™, Thermo Fisher Scientific) equipped with an EASY-Spray™ electrospray source (held at a constant 35 °C). Chromatography of peptides prior to mass spectral analysis was accomplished using a capillary emitter column (PepMap® C18, 3 µm, 100 Å, 150 x 0.075 mm, Thermo Fisher Scientific) onto which two microliters of extracted peptides were automatically loaded. The NanoHPLC system delivered solvents A (0.1% (v/v) formic acid) and B (99.9% (v/v) acetonitrile, 0.1% (v/v) formic acid) at 0.50 µl/min to load the peptides (over a 30 minute period) and at 0.3 µl/min to elute peptides directly into the nanoelectrospray, with a gradual gradient from 0% (v/v) B to 30% (v/v) B over 80 minutes, concluding with a five minute fast gradient from 30% (v/v) B to 50% (v/v) B, at which time a four minute flash-out from 50% to 95% (v/v) B took place. The total run time of 150 minutes encompassed column conditioning at 95% B for one minute and equilibration at 100% A for 30 minutes. As peptides eluted from the HPLC column/electrospray source, survey MS scans were acquired in the Orbitrap with a resolution of 120,000, followed by CID-type MS/MS with 2.0 AMU isolation, 10 millisecond activation time, and 35% normalized collision energy fragmentation of the 30 most intense peptides detected in the MS1 scan from 350 to 1800 m/z; redundancy was limited by dynamic exclusion. Monoisotopic precursor selection and charge state screening were enabled, and +1 and undefined charge states were rejected.
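The stated 150 minute run time can be checked by adding up the segments of the LC program. The sketch below lists the segments in the order they appear in the text; treating them as the complete program, in this sequence, is an assumption, as are the segment labels.

```python
# (label, duration in minutes), in the order given in the text
LC_SEGMENTS = [
    ("load peptides at 0.50 µl/min", 30),
    ("gradient 0-30% B", 80),
    ("fast gradient 30-50% B", 5),
    ("flash-out 50-95% B", 4),
    ("conditioning at 95% B", 1),
    ("equilibration at 100% A", 30),
]


def total_runtime(segments):
    """Sum of all segment durations in minutes."""
    return sum(minutes for _, minutes in segments)


def segment_start_times(segments):
    """Map each segment label to its start time (minutes from injection)."""
    starts, elapsed = {}, 0
    for label, minutes in segments:
        starts[label] = elapsed
        elapsed += minutes
    return starts


if __name__ == "__main__":
    for label, start in segment_start_times(LC_SEGMENTS).items():
        print(f"{start:>4} min  {label}")
    print("total:", total_runtime(LC_SEGMENTS), "min")
```

The segment durations sum to 150 minutes, which agrees with the total run time given above.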

Data analysis and cross-linking assignment
Raw MS/MS data were converted to mgf file format using MSConvert (ProteoWizard: Open Source Software for Rapid Proteomics Tools Development) for downstream analysis. The resulting mgf files were searched against a user-defined C. atrox venom proteome or liver proteome (translated transcriptome) amino acid sequence database, together with a list of common contaminants (172 total entries), using an in-house Mascot search engine 2.7.0 (Matrix Science) with variable methionine oxidation, asparagine and glutamine deamidation, and fixed cysteine carbamidomethylation. Peptide mass tolerance was set at 10 ppm and fragment mass tolerance at 0.8 Da. Protein annotation, significance of identification, and spectral-based quantification were performed with Scaffold software (version 4.11.0, Proteome Software Inc., Portland, OR). Peptide identifications were accepted if they could be established at greater than 97.0% probability to achieve a False Discovery Rate (FDR) of less than 1.0% by the Scaffold Local FDR algorithm. Protein identifications were accepted if they could be established at greater than 61.0% probability to achieve an FDR of less than 1.0% and contained at least two identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm (11). Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. Proteins sharing significant peptide evidence were grouped into clusters.

The fetua locus in C. scutulatus was retrieved on two separate contigs from the C. scutulatus genome and shows at least five fetua genes and two fetub genes (orange arrows).

... to the corresponding C. horridus genomic sequence. The presence of these regions in the cDNA sequences supports the conclusion that exon 6 for these genes is present in the genome and that their absence in our sequencing data is due to technical issues.
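The 10 ppm peptide mass tolerance used in the Mascot search described under Data analysis is a relative error, so the allowed absolute deviation scales with peptide mass. The sketch below shows the arithmetic only; it is not how Mascot is implemented, and the function names and example masses are hypothetical.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=10.0):
    """True if the observed mass lies within tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical peptide masses in Da, chosen only for illustration.
    for observed in (1000.005, 1000.020):
        status = "accepted" if within_tolerance(observed, 1000.0) else "rejected"
        print(f"{observed:.3f} Da: {ppm_error(observed, 1000.0):.1f} ppm, {status}")
```

For a 1000 Da peptide, 10 ppm corresponds to 0.01 Da, which is far tighter than the 0.8 Da tolerance applied to ion trap fragment masses.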